Mathematical Language Processing Project

Authors

  • Robert Pagel
  • Moritz Schubotz
Abstract

In natural language, words and phrases themselves imply the semantics. In contrast, the meaning of identifiers in mathematical formulae is undefined, so scientists must study the context to decode it. The Mathematical Language Processing (MLP) project aims to support that process. In this paper, we compare two approaches to discovering identifier-definition tuples. First, we use a simple pattern matching approach. Second, we present the MLP approach, which uses part-of-speech tag based distances as well as sentence positions to calculate identifier-definition probabilities. The evaluation of our prototypical system, applied to the Wikipedia text corpus, shows that our approach substantially improves the user experience. When the user hovers over an identifier in a formula, a tooltip with the most probable definitions appears. Tests with random samples show that the displayed definitions match the actual meaning of the identifiers well.
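To make the two approaches in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the regular expression, the function names `match_patterns` and `rank_candidates`, and the score `1 / (1 + distance)` are all hypothetical, and the sentence-position component mentioned in the abstract is left out. The second function expects tokens that already carry Penn Treebank style part-of-speech tags.

```python
import re

# Hypothetical pattern: "<identifier> is|denotes the|a|an <noun>".
# The abstract does not list the patterns the authors actually used.
DEFINITION_PATTERN = re.compile(
    r"\b(?P<identifier>[A-Za-z])\s+(?:is|denotes)\s+(?:the|an|a)\s+"
    r"(?P<definition>[a-z]+)"
)


def match_patterns(text):
    """Approach 1: return (identifier, definition) pairs found by the pattern."""
    return [
        (m.group("identifier"), m.group("definition"))
        for m in DEFINITION_PATTERN.finditer(text)
    ]


def rank_candidates(tagged_tokens, identifier):
    """Approach 2 (toy version): rank nouns by token distance to the identifier.

    tagged_tokens is a list of (token, pos_tag) pairs. The score
    1 / (1 + distance) is an illustrative assumption, not the
    probability model from the paper.
    """
    positions = [i for i, (tok, _) in enumerate(tagged_tokens) if tok == identifier]
    if not positions:
        return []
    scored = []
    for i, (tok, tag) in enumerate(tagged_tokens):
        # Only nouns other than the identifier itself are definition candidates.
        if tag.startswith("NN") and tok != identifier:
            distance = min(abs(i - p) for p in positions)
            scored.append((tok, 1.0 / (1.0 + distance)))
    # Closest candidates first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(match_patterns("where E is the energy and m denotes the mass"))
    tagged = [("The", "DT"), ("energy", "NN"), ("E", "NN"),
              ("of", "IN"), ("a", "DT"), ("body", "NN")]
    print(rank_candidates(tagged, "E"))
```

The pattern-based function only finds definitions phrased in the exact form it encodes, while the distance-based ranking returns every noun near the identifier with a score, which is closer in spirit to showing the "most probable definitions" in a tooltip.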

Similar resources

Towards a Self-reflective, Context-aware Semantic Representation of Mathematical Specifications

We discuss a framework for the representation and processing of mathematics developed within and for the MOSMATH project. The MOSMATH project aims to create a software system that is able to translate optimization problems from an almost natural language to the algebraic modeling language AMPL. As part of a greater vision (the FMathL project), this framework is designed both to serve the optimi...

The DeLiVerMATH Project - Text Analysis in Mathematics

A high-quality content analysis is essential for retrieval functionalities but the manual extraction of key phrases and classification is expensive. Natural language processing provides a framework to automatize the process. Here, a machine-based approach for the content analysis of mathematical texts is described. A prototype for key phrase extraction and classification of mathematical texts i...

The Archaeotools project: faceted classification and natural language processing in an archaeological context.

This paper describes 'Archaeotools', a major e-Science project in archaeology. The aim of the project is to use faceted classification and natural language processing to create an advanced infrastructure for archaeological research. The project aims to integrate over 1 × 10^6 structured database records referring to archaeological sites and monuments in the UK, with information extracted from ...

Modified Pareto archived evolution strategy for the multi-skill project scheduling problem with generalized precedence relations

In this research, we study the multi-skill resource-constrained project scheduling problem, where there are generalized precedence relations between project activities. Workforces are able to perform one or several skills, and their efficiency improves by repeating their skills. For this problem, a mathematical formulation has been proposed that aims to optimize project completion time, reworki...

Natural Language Dialog with a Tutor System for Mathematical Proofs

Natural language interaction between a student and a tutoring or an assistance system for mathematics is a new multi-disciplinary challenge that requires the interaction of (i) advanced natural language processing, (ii) flexible tutorial dialog strategies including hints, and (iii) mathematical domain reasoning. This paper provides an overview on the current research in the multi-disciplinary r...

Journal:
  • CoRR

Volume abs/1407.0167

Publication date 2014